Building Effective Agents

Simple, Composable Patterns for LLM Systems

Over the past year, we've worked with dozens of teams building large language model (LLM) agents across industries. Consistently, the most successful implementations weren't using complex frameworks or specialized libraries. Instead, they were building with simple, composable patterns.

In this presentation, we share what we've learned from working with customers and building agents ourselves, and give practical advice for developers on building effective agents.

What are agents?

"Agent" can be defined in several ways. At Anthropic, we categorize all these variations as agentic systems, but draw an important architectural distinction:

Workflows

Systems where LLMs and tools are orchestrated through predefined code paths.

Agents

Systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.

Below, we will explore both types of agentic systems in detail.

When (and when not) to use agents

When building applications with LLMs, we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all.

Considerations

Agentic systems often trade latency and cost for better task performance; consider when that tradeoff makes sense. Workflows offer predictability and consistency for well-defined tasks, while agents are the better option when flexibility and model-driven decision-making are needed at scale.

When and how to use frameworks

There are many frameworks that make agentic systems easier to implement, including LangGraph from LangChain, Amazon Bedrock's AI Agent framework, and GUI workflow builders such as Rivet and Vellum.

Our recommendation

We suggest that developers start by using LLM APIs directly: many patterns can be implemented in a few lines of code. If you do use a framework, make sure you understand the underlying code; incorrect assumptions about what's under the hood are a common source of error.

Building block: The augmented LLM

The basic building block of agentic systems is an LLM enhanced with augmentations such as retrieval, tools, and memory. Our current models can actively use these capabilities—generating their own search queries, selecting appropriate tools, and determining what information to retain.

The augmented LLM

We recommend focusing on two key aspects: tailoring these capabilities to your specific use case and ensuring they provide an easy, well-documented interface for your LLM.
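
As a concrete illustration, here is a minimal Python sketch of an augmented LLM, using the Anthropic Messages API for the model call. The `retrieve` function is a hypothetical stand-in for your own retrieval layer, and the prompt format is illustrative; any LLM API would work in place of `llm_call`.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

def llm_call(prompt: str) -> str:
    """One model call; returns the text of the response."""
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def retrieve(query: str) -> list[str]:
    """Hypothetical retrieval layer -- swap in your vector store or search index."""
    return ["<relevant snippet 1>", "<relevant snippet 2>"]

def augmented_llm(question: str) -> str:
    # Augment the prompt with retrieved context before calling the model.
    context = "\n".join(retrieve(question))
    return llm_call(f"Context:\n{context}\n\nQuestion: {question}")
```

The same small-wrapper approach extends to tools and memory: keep each augmentation behind an interface the model can use reliably.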

Workflow: Prompt chaining

Prompt chaining decomposes a task into a sequence of steps, where each LLM call processes the output of the previous one. You can add programmatic checks on any intermediate steps to ensure that the process is still on track.

The prompt chaining workflow

When to use this workflow

Ideal for situations where the task can be easily and cleanly decomposed into fixed subtasks. The main goal is to trade off latency for higher accuracy, by making each LLM call an easier task.
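
A minimal sketch of a two-step chain with a programmatic gate between the steps, under the same Anthropic-SDK assumption as above; the summarization task and the emptiness check are illustrative.

```python
import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def chain(document: str) -> str:
    # Step 1: extract the claims the document makes.
    claims = llm_call(f"List the factual claims made in this document:\n{document}")
    # Programmatic gate between steps: stop early if extraction failed.
    if not claims.strip():
        raise ValueError("Extraction produced no claims; stopping the chain.")
    # Step 2: a second, easier call works only from the extracted claims.
    return llm_call(f"Write a one-paragraph summary of these claims:\n{claims}")
```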

Workflow: Routing

Routing classifies an input and directs it to a specialized follow-up task. This workflow allows for separation of concerns and for building more specialized prompts.

The routing workflow

When to use this workflow

Works well for complex tasks where there are distinct categories that are better handled separately, and where classification can be handled accurately.

Examples

  • Directing different types of customer service queries (general questions, refund requests, technical support) into distinct downstream processes, prompts, and tools.
  • Routing easy, common questions to smaller, faster models and hard, unusual questions to more capable models to optimize cost and speed.
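
A minimal routing sketch under the same Anthropic-SDK assumption; the three route labels and their prompts are hypothetical.

```python
import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

# Each route gets its own specialized prompt (and could get its own model or tools).
ROUTES = {
    "refund": "You are a refunds specialist. Handle this request:\n",
    "technical": "You are a technical support engineer. Diagnose this issue:\n",
    "general": "You are a friendly support agent. Answer this question:\n",
}

def route(ticket: str) -> str:
    # Step 1: classify the input.
    label = llm_call(
        f"Classify this support ticket as exactly one of: {', '.join(ROUTES)}. "
        f"Reply with the label only.\n\n{ticket}"
    ).strip().lower()
    # Step 2: direct it to the specialized follow-up prompt, with a safe fallback.
    return llm_call(ROUTES.get(label, ROUTES["general"]) + ticket)
```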

Workflow: Parallelization

LLMs can sometimes work simultaneously on a task and have their outputs aggregated programmatically. This workflow manifests in two key variations:

Sectioning

Breaking a task into independent subtasks run in parallel.

Voting

Running the same task multiple times to get diverse outputs.

The parallelization workflow

When to use this workflow

Effective when subtasks can be parallelized for speed, or when multiple perspectives are needed for higher confidence results.
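
A minimal sketch of both variations, using a thread pool to issue the model calls concurrently; the review aspects and the yes/no voting scheme are illustrative.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def section(document: str, aspects: list[str]) -> dict[str, str]:
    """Sectioning: review independent aspects of a document in parallel."""
    with ThreadPoolExecutor() as pool:
        reviews = pool.map(
            lambda aspect: llm_call(f"Review this document for {aspect}:\n{document}"),
            aspects,
        )
        return dict(zip(aspects, reviews))

def vote(question: str, n: int = 5) -> str:
    """Voting: run the same yes/no question n times and take the majority answer."""
    with ThreadPoolExecutor() as pool:
        answers = pool.map(
            lambda _: llm_call(f"{question}\nAnswer 'yes' or 'no' only.").strip().lower(),
            range(n),
        )
        return Counter(answers).most_common(1)[0][0]
```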

Workflow: Orchestrator-workers

In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results.

The orchestrator-workers workflow

When to use this workflow

Well-suited for complex tasks where you can't predict the subtasks needed. The key difference from parallelization is its flexibility—subtasks aren't pre-defined, but determined by the orchestrator.

Example

Coding products that need to make complex changes across multiple files, where the right set of edits can't be predicted in advance.
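
A minimal sketch of the pattern; having the orchestrator plan in JSON is one convenient convention, not the only one.

```python
import json

import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def orchestrate(task: str) -> str:
    # The orchestrator decides the subtasks at runtime, not in code.
    plan = llm_call("Break this task into independent subtasks. "
                    f"Reply with a JSON array of strings only.\n\nTask: {task}")
    subtasks = json.loads(plan)  # production code should validate and retry here
    # Workers complete the subtasks (these calls could run in parallel).
    results = [llm_call(f"Complete this subtask:\n{s}") for s in subtasks]
    # The orchestrator synthesizes the workers' results into one answer.
    return llm_call("Synthesize these partial results into one answer.\n\n"
                    f"Task: {task}\n\nResults:\n" + "\n---\n".join(results))
```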

Workflow: Evaluator-optimizer

In the evaluator-optimizer workflow, one LLM call generates a response while another provides evaluation and feedback in a loop.

The evaluator-optimizer workflow

When to use this workflow

Particularly effective when we have clear evaluation criteria, and when iterative refinement provides measurable value. This is analogous to the iterative writing process a human writer might go through.

Examples

  • Literary translation, where an evaluator LLM can catch nuances that the translator LLM misses on a first pass.
  • Complex search tasks that require multiple rounds of searching and analysis, where the evaluator decides whether further searching is warranted.
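
A minimal sketch of the generate-evaluate-revise loop; the PASS convention and the fixed round budget are illustrative choices.

```python
import anthropic

client = anthropic.Anthropic()

def llm_call(prompt: str) -> str:
    # Same one-call wrapper as in the augmented-LLM sketch.
    resp = client.messages.create(model="claude-3-5-sonnet-20241022", max_tokens=1024,
                                  messages=[{"role": "user", "content": prompt}])
    return resp.content[0].text

def refine(task: str, max_rounds: int = 3) -> str:
    draft = llm_call(f"Complete this task:\n{task}")
    for _ in range(max_rounds):
        # The evaluator judges the draft against explicit criteria.
        feedback = llm_call("Evaluate this draft against the task. Reply PASS if it "
                            "fully satisfies the task; otherwise list concrete fixes."
                            f"\n\nTask: {task}\n\nDraft:\n{draft}")
        if feedback.strip().startswith("PASS"):
            break
        # The optimizer revises the draft using the evaluator's feedback.
        draft = llm_call(f"Revise the draft to address the feedback.\n\nTask: {task}"
                         f"\n\nDraft:\n{draft}\n\nFeedback:\n{feedback}")
    return draft
```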

Agents

Agents are emerging in production as LLMs mature in key capabilities—understanding complex inputs, engaging in reasoning and planning, using tools reliably, and recovering from errors.

Autonomous agent

When to use agents

Agents can be used for open-ended problems where it's difficult or impossible to predict the required number of steps, and where you can't hardcode a fixed path. Because the LLM may operate for many turns, you must have some level of trust in its decision-making.

Note: The autonomous nature of agents means higher costs and the potential for compounding errors. We recommend extensive testing in sandboxed environments, along with appropriate guardrails.
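
To make the loop concrete, here is a minimal sketch of an autonomous coding-style agent built on the Anthropic tool-use API: the model plans, calls a tool, and gets ground truth back from the environment on each turn. The single `run_tests` tool and the pytest command are illustrative assumptions; a real deployment would add sandboxing and guardrails around tool execution.

```python
import subprocess

import anthropic

client = anthropic.Anthropic()

TOOLS = [{
    "name": "run_tests",
    "description": "Run the project's test suite and return its full output.",
    "input_schema": {"type": "object", "properties": {}},
}]

def run_tests() -> str:
    # Ground truth from the environment: real test results, not the model's guess.
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.stdout + proc.stderr

def agent(task: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):  # hard turn budget to bound cost
        resp = client.messages.create(model="claude-3-5-sonnet-20241022",
                                      max_tokens=2048, tools=TOOLS, messages=messages)
        if resp.stop_reason != "tool_use":
            # The model decided it is done; return its final text.
            return "".join(b.text for b in resp.content if b.type == "text")
        # Execute each requested tool and feed the real results back to the model.
        messages.append({"role": "assistant", "content": resp.content})
        messages.append({"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": b.id, "content": run_tests()}
            for b in resp.content if b.type == "tool_use"
        ]})
    raise RuntimeError("Agent did not finish within the turn budget.")
```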

Combining and customizing these patterns

These building blocks aren't prescriptive. They're common patterns that developers can shape and combine to fit different use cases.

Key to success

As with any LLM feature, measuring performance and iterating on implementations is crucial. Add complexity only when it demonstrably improves outcomes.

High-level flow of a coding agent

Frameworks can help you get started quickly, but don't hesitate to reduce abstraction layers and build with basic components as you move to production.

Summary

Success in the LLM space isn't about building the most sophisticated system. It's about building the right system for your needs.

Our approach

  1. Start with simple prompts
  2. Optimize them with comprehensive evaluation
  3. Add multi-step agentic systems only when simpler solutions fall short

Three core principles for implementing agents

  1. Maintain simplicity in your agent's design
  2. Prioritize transparency by explicitly showing the agent's planning steps
  3. Carefully craft your agent-computer interface (ACI) through thorough tool documentation and testing

Appendix 1: Agents in practice

Our work with customers has revealed two particularly promising applications for AI agents that demonstrate the practical value of the patterns discussed above.

Customer support

A natural fit for more open-ended agents because:

  • Support interactions naturally follow a conversation flow
  • Tools can pull customer data and order history
  • Actions like issuing refunds can be handled programmatically
  • Success can be clearly measured through user-defined resolutions

Coding agents

Particularly effective because:

  • Code solutions are verifiable through automated tests
  • Agents can iterate using test results as feedback
  • The problem space is well-defined and structured
  • Output quality can be measured objectively

Appendix 2: Prompt engineering your tools

No matter which agentic system you're building, tools will likely be an important part of your agent. Tool definitions and specifications should be given just as much prompt engineering attention as your overall prompts.

Our suggestions for deciding on tool formats

  • Give the model enough tokens to "think" before it writes itself into a corner.
  • Keep the format close to what the model has seen naturally occurring in text on the internet.
  • Make sure there's no formatting "overhead", such as having to keep an accurate count of thousands of lines of code or string-escaping any code it writes.

Creating good agent-computer interfaces (ACI)

  • Put yourself in the model's shoes: is it obvious how to use the tool from the description and parameters alone, or would it take careful thought? A good tool definition reads like a good docstring.
  • Test how the model actually uses your tools: run many example inputs, see what mistakes the model makes, and iterate.
  • Poka-yoke your tools: change the arguments so that it is harder to make mistakes with them.
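
As an example of what that attention looks like, here is a hypothetical tool definition in the Anthropic Messages API format. The `edit_file` tool and its exact-match, unique-match requirement are illustrative, but they show both the docstring-quality description and the mistake-proofing we mean.

```python
# A hypothetical tool definition. The description and parameter docs are
# written with as much care as any prompt, and the unique-match rule makes
# the tool fail safely instead of applying an ambiguous edit.
edit_file_tool = {
    "name": "edit_file",
    "description": (
        "Replace an exact, unique snippet of text in a file. Use this for "
        "small, targeted edits. old_text must appear exactly once in the "
        "file, including whitespace; if it does not, the call fails and no "
        "change is made."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string",
                     "description": "Path to the file, relative to the repo root."},
            "old_text": {"type": "string",
                         "description": "Exact text to replace; must be unique in the file."},
            "new_text": {"type": "string",
                         "description": "Text to insert in place of old_text."},
        },
        "required": ["path", "old_text", "new_text"],
    },
}
```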
